---
title: Team Collaboration on Data Management
description: Best practices for managing datasets collaboratively within your team.
---

Managing datasets for prompt evaluation and testing is usually a team effort. The practices below help teams work with datasets in Latitude:

## 1. Naming Conventions

- **Use Clear Names**: Adopt a consistent naming convention for datasets to indicate their purpose, source, or version (e.g., `Q3_Support_Tickets_Batch`, `Product_Desc_Golden_v2`, `Generated_Edge_Cases_Billing`).
- **Add Descriptions**: Use the description field when creating or managing datasets to record the data source, purpose, and key columns.
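
A naming convention is easier to keep when it can be checked automatically. The sketch below is one possible check, not a Latitude feature: the pattern (underscore-separated words that start with an uppercase letter, with an optional `_v<number>` suffix) and the helper names are assumptions inferred from the example names above, so adapt them to whatever your team agrees on.

```python
import re

# Hypothetical team convention: underscore-separated words that each start
# with an uppercase letter, optionally ending in a version suffix like _v2.
DATASET_NAME_PATTERN = re.compile(
    r"^[A-Z][A-Za-z0-9]*(?:_[A-Z][A-Za-z0-9]*)*(?:_v\d+)?$"
)


def is_valid_dataset_name(name: str) -> bool:
    """Return True if the name follows the assumed convention."""
    return DATASET_NAME_PATTERN.fullmatch(name) is not None


def find_nonconforming(names: list[str]) -> list[str]:
    """Return the names that break the convention, in their original order."""
    return [name for name in names if not is_valid_dataset_name(name)]


if __name__ == "__main__":
    candidates = [
        "Q3_Support_Tickets_Batch",
        "Product_Desc_Golden_v2",
        "Generated_Edge_Cases_Billing",
        "tmp test",
    ]
    # Lists only the names that would need renaming.
    print(find_nonconforming(candidates))
```

A check like this could run against a list of dataset names that you maintain or export yourself, for example during a periodic cleanup.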

## 2. Centralized vs. Distributed Datasets

- **Central Golden Datasets**: Maintain shared "golden datasets" that the whole team uses for critical regression testing.
- **Individual/Team Scratchpads**: Allow team members to create temporary or experimental datasets (e.g., by saving filtered logs) for specific investigations without cluttering the main list. Establish a process for promoting useful scratchpad datasets to shared status.

## 3. Leveraging Logs

- **Saving Logs**: Encourage team members to [save interesting or problematic logs](/guides/datasets/overview#3-saving-logs-as-datasets) as small datasets for discussion or targeted evaluation.
- **Filtering**: Use metadata and filters in the Logs view to identify specific subsets of interactions worth saving as datasets, and share the filter criteria with the team (e.g., "all logs where sentiment evaluation < 3 for prompt X").
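
Writing a filter criterion down precisely helps teammates reproduce the same subset. The sketch below expresses the example criterion ("sentiment evaluation < 3 for prompt X") as a plain Python function over log records you have exported yourself. The record shape (`prompt`, `scores`) and the function name are hypothetical and do not reflect Latitude's log schema or API.

```python
from typing import Any

# Hypothetical shape for an exported log record:
# {"id": ..., "prompt": "<prompt name>", "scores": {"<evaluation name>": <number>}}
LogRecord = dict[str, Any]


def select_logs(
    logs: list[LogRecord], prompt: str, evaluation: str, below: float
) -> list[LogRecord]:
    """Keep logs for `prompt` whose `evaluation` score is strictly below `below`."""
    selected = []
    for log in logs:
        if log.get("prompt") != prompt:
            continue
        score = log.get("scores", {}).get(evaluation)
        # Logs without a score for this evaluation are skipped, not treated as failures.
        if score is not None and score < below:
            selected.append(log)
    return selected


if __name__ == "__main__":
    sample_logs = [
        {"id": 1, "prompt": "X", "scores": {"sentiment": 2}},
        {"id": 2, "prompt": "X", "scores": {"sentiment": 4}},
        {"id": 3, "prompt": "Y", "scores": {"sentiment": 1}},
        {"id": 4, "prompt": "X", "scores": {}},
    ]
    low_sentiment = select_logs(sample_logs, prompt="X", evaluation="sentiment", below=3)
    print([log["id"] for log in low_sentiment])
```

Skipping logs that have no score is a deliberate choice here: an unevaluated log is not evidence of a problem, so it is left out rather than counted as a low score.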

## 4. Roles and Permissions

- **Understand Roles**: Learn how user roles (Admin, Member, Viewer) affect dataset management permissions within Latitude (e.g., who can create, edit, or delete datasets).
- **Assign Roles**: Assign roles based on team responsibilities.

## 5. Communication

- **Announce New Datasets**: Inform the team when significant new shared datasets (especially golden datasets) are created or updated.
- **Discuss Usage**: Clarify which datasets should be used for specific types of evaluations or testing.
- **Review Process**: If curating golden datasets, establish a review process before finalizing them.

## 6. Balancing Synthetic vs. Real Data

- **Synthetic Data**: Useful for quickly generating diverse inputs or testing specific formats. Use the [dataset generator](/guides/datasets/overview#2-generating-synthetic-data) for bootstrapping.
- **Real Data (Logs)**: Essential for evaluating performance on actual user interactions. Regularly create datasets from recent logs.
- **Combined Approach**: A mix often works best. Start from real logs and supplement them with generated or manually crafted data to cover edge cases or scenarios not yet seen in production.
- **Discuss Strategy**: Decide as a team the right balance based on the prompt's goals and potential risks.
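
Once the team agrees on a balance, it helps to make that balance explicit and reproducible. The sketch below is one way to combine real and synthetic rows at a chosen ratio using only the Python standard library. The function name, the `synthetic_fraction` parameter, and the added `source` column are illustrative assumptions, not part of Latitude.

```python
import random
from typing import Any

Row = dict[str, Any]


def build_mixed_dataset(
    real_rows: list[Row],
    synthetic_rows: list[Row],
    size: int,
    synthetic_fraction: float,
    seed: int = 0,
) -> list[Row]:
    """Sample up to `size` rows, with roughly `synthetic_fraction` of them synthetic."""
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be between 0 and 1")

    rng = random.Random(seed)  # fixed seed so teammates get the same sample

    # Cap each count by what is available; the result may be smaller than `size`.
    n_synthetic = min(round(size * synthetic_fraction), len(synthetic_rows))
    n_real = min(size - n_synthetic, len(real_rows))

    # Tag each row with its origin so results can be broken down by source later.
    rows = [{**row, "source": "real"} for row in rng.sample(real_rows, n_real)]
    rows += [
        {**row, "source": "synthetic"}
        for row in rng.sample(synthetic_rows, n_synthetic)
    ]
    rng.shuffle(rows)
    return rows


if __name__ == "__main__":
    real = [{"input": f"real question {i}"} for i in range(10)]
    synthetic = [{"input": f"generated edge case {i}"} for i in range(10)]
    mixed = build_mixed_dataset(real, synthetic, size=10, synthetic_fraction=0.3)
    print(sum(row["source"] == "synthetic" for row in mixed), "synthetic rows of", len(mixed))
```

Keeping a `source` column lets you compare evaluation results on real versus synthetic rows afterwards, which can inform the next discussion about the balance.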

With clear practices for naming, sharing, and curating datasets, your team can use data more effectively to improve prompts.

## Next Steps

- Review [Creating and Using Datasets](/guides/datasets/overview)
- Learn about [Golden Datasets for Regression Testing](/guides/datasets/golden-datasets)
- Integrate datasets into your [Evaluation Workflows](/guides/evaluations/integrating-evaluations-workflow)
